BE-429: HashQL: Add island dependency graph with data requirement resolution#8501
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
2 Skipped Deployments
|
PR SummaryMedium Risk Overview Introduces a topological scheduler ( Supporting utilities were added in Reviewed by Cursor Bugbot for commit a4534ce. Bugbot is set up for automated code reviews on this repo. Configure here. |
🤖 Augment PR SummarySummary: Adds an island dependency graph and scheduler to the HashQL MIR execution engine, including automatic data requirement resolution across targets. Changes:
Technical Notes: Data island selection uses the first set origin target (backend-priority order), and schedule levels are derived from predecessor constraints to enable safe parallel execution. 🤖 Was this summary useful? React with 👍 or 👎 |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## bm/be-428-hashql-simplify-traversal-tracking-to-path-recording #8501 +/- ##
==================================================================================================
+ Coverage 62.57% 62.75% +0.18%
==================================================================================================
Files 1322 1326 +4
Lines 134824 135546 +722
Branches 5509 5527 +18
==================================================================================================
+ Hits 84366 85064 +698
- Misses 49547 49568 +21
- Partials 911 914 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Benchmark results
|
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2002 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1001 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 3314 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 1526 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 2078 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 1033 | Flame Graph |
policy_resolution_medium
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 102 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 51 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 269 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 107 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 133 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 63 | Flame Graph |
policy_resolution_none
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 2 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 8 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 3 | Flame Graph |
policy_resolution_small
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| resolve_policies_for_actor | user: empty, selectivity: high, policies: 52 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: empty, selectivity: medium, policies: 25 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: high, policies: 94 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: seeded, selectivity: medium, policies: 26 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: high, policies: 66 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: low, policies: 1 | Flame Graph | |
| resolve_policies_for_actor | user: system, selectivity: medium, policies: 29 | Flame Graph |
read_scaling_complete
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id;one_depth | 1 entities | Flame Graph | |
| entity_by_id;one_depth | 10 entities | Flame Graph | |
| entity_by_id;one_depth | 25 entities | Flame Graph | |
| entity_by_id;one_depth | 5 entities | Flame Graph | |
| entity_by_id;one_depth | 50 entities | Flame Graph | |
| entity_by_id;two_depth | 1 entities | Flame Graph | |
| entity_by_id;two_depth | 10 entities | Flame Graph | |
| entity_by_id;two_depth | 25 entities | Flame Graph | |
| entity_by_id;two_depth | 5 entities | Flame Graph | |
| entity_by_id;two_depth | 50 entities | Flame Graph | |
| entity_by_id;zero_depth | 1 entities | Flame Graph | |
| entity_by_id;zero_depth | 10 entities | Flame Graph | |
| entity_by_id;zero_depth | 25 entities | Flame Graph | |
| entity_by_id;zero_depth | 5 entities | Flame Graph | |
| entity_by_id;zero_depth | 50 entities | Flame Graph |
read_scaling_linkless
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | 1 entities | Flame Graph | |
| entity_by_id | 10 entities | Flame Graph | |
| entity_by_id | 100 entities | Flame Graph | |
| entity_by_id | 1000 entities | Flame Graph | |
| entity_by_id | 10000 entities | Flame Graph |
representative_read_entity
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/block/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/book/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/building/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/organization/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/page/v/2
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/person/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/playlist/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/song/v/1
|
Flame Graph | |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/uk-address/v/1
|
Flame Graph |
representative_read_entity_type
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| get_entity_type_by_id | Account ID: bf5a9ef5-dc3b-43cf-a291-6210c0321eba
|
Flame Graph |
representative_read_multiple_entities
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| entity_by_property | traversal_paths=0 | 0 | |
| entity_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| entity_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=0 | 0 | |
| link_by_source_by_property | traversal_paths=255 | 1,resolve_depths=inherit:1;values:255;properties:255;links:127;link_dests:126;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:0;link_dests:0;type:false | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:0;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:0;properties:2;links:1;link_dests:0;type:true | |
| link_by_source_by_property | traversal_paths=2 | 1,resolve_depths=inherit:0;values:2;properties:2;links:1;link_dests:0;type:true |
scenarios
| Function | Value | Mean | Flame graphs |
|---|---|---|---|
| full_test | query-limited | Flame Graph | |
| full_test | query-unlimited | Flame Graph | |
| linked_queries | query-limited | Flame Graph | |
| linked_queries | query-unlimited | Flame Graph |
572197c to
74ce977
Compare
e6439e0 to
2346ece
Compare
74ce977 to
b8fe72b
Compare
2346ece to
5e31131
Compare
|
Deployment failed with the following error: |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Bugbot Autofix prepared fixes for all 3 issues found in the latest run.
- ✅ Fixed: Cycles silently drop scheduled islands
- Added a full-node-count assertion in
schedule_inso cyclic control-flow now fails loudly instead of returning a partial schedule.
- Added a full-node-count assertion in
- ✅ Fixed: Range bound semantics are lost
- Updated
IdVec::copy_withinto preserveBound::{Included,Excluded,Unbounded}when converting typed ID bounds tousizebounds.
- Updated
- ✅ Fixed: Requirement resolution assumes acyclic control flow
- Replaced DFS-postorder-based ordering with Kahn topological ordering plus a cycle assertion in island requirement resolution.
Or push these changes by commenting:
@cursor push 27aca44a02
Preview (27aca44a02)
diff --git a/libs/@local/hashql/core/src/id/vec.rs b/libs/@local/hashql/core/src/id/vec.rs
--- a/libs/@local/hashql/core/src/id/vec.rs
+++ b/libs/@local/hashql/core/src/id/vec.rs
@@ -6,7 +6,7 @@
fmt::{self, Debug},
hash::{Hash, Hasher},
marker::PhantomData,
- ops::{Deref, DerefMut, RangeBounds},
+ ops::{Bound, Deref, DerefMut, RangeBounds},
slice,
};
@@ -510,8 +510,16 @@
where
T: Copy,
{
- let start = src.start_bound().copied().map(Id::as_usize);
- let end = src.end_bound().copied().map(Id::as_usize);
+ let start = match src.start_bound() {
+ Bound::Included(&bound) => Bound::Included(bound.as_usize()),
+ Bound::Excluded(&bound) => Bound::Excluded(bound.as_usize()),
+ Bound::Unbounded => Bound::Unbounded,
+ };
+ let end = match src.end_bound() {
+ Bound::Included(&bound) => Bound::Included(bound.as_usize()),
+ Bound::Excluded(&bound) => Bound::Excluded(bound.as_usize()),
+ Bound::Unbounded => Bound::Unbounded,
+ };
self.raw.copy_within((start, end), dst.as_usize());
}
@@ -796,3 +804,23 @@
Self::from_raw(Vec::from_iter_in(iter, alloc))
}
}
+
+#[cfg(test)]
+mod tests {
+ use super::IdVec;
+ use crate::id::{Id as _, newtype};
+
+ newtype!(#[id(crate = crate)] struct TestId(u32 is 0..=32));
+
+ #[test]
+ fn copy_within_preserves_inclusive_end_bound() {
+ let mut vec = IdVec::<TestId, i32>::from_raw(alloc::vec![10, 20, 30]);
+
+ vec.copy_within(
+ TestId::from_usize(1)..=TestId::from_usize(1),
+ TestId::from_usize(0),
+ );
+
+ assert_eq!(vec.as_slice().as_raw(), &[20, 20, 30]);
+ }
+}
diff --git a/libs/@local/hashql/mir/src/pass/execution/island/graph/mod.rs b/libs/@local/hashql/mir/src/pass/execution/island/graph/mod.rs
--- a/libs/@local/hashql/mir/src/pass/execution/island/graph/mod.rs
+++ b/libs/@local/hashql/mir/src/pass/execution/island/graph/mod.rs
@@ -20,7 +20,7 @@
#[cfg(test)]
pub(crate) mod tests;
-use alloc::alloc::Global;
+use alloc::{alloc::Global, collections::VecDeque};
use core::{
alloc::Allocator,
ops::{Index, IndexMut},
@@ -29,11 +29,10 @@
use hashql_core::{
debug_panic,
graph::{
- DirectedGraph, EdgeId, LinkedGraph, NodeId, Predecessors, Successors, Traverse as _,
+ DirectedGraph, EdgeId, LinkedGraph, NodeId, Predecessors, Successors,
algorithms::{Dominators, dominators},
linked::Edge,
},
- heap::CollectIn as _,
id::{
HasId as _, Id as _,
bit_vec::{BitMatrix, DenseBitSet},
@@ -243,13 +242,39 @@
where
S: Allocator + Clone,
{
- let mut topo: Vec<IslandId, _> = self
- .inner
- .depth_first_forest_post_order()
- .map(|node| IslandId::new(node.as_u32()))
- .collect_in(scratch.clone());
- topo.reverse();
+ #[expect(clippy::cast_possible_truncation)]
+ let node_count = self.node_count();
+ let mut in_degree = IslandVec::from_elem_in(0_u32, node_count, scratch.clone());
+ for (island_id, _) in self.iter_nodes() {
+ in_degree[island_id] = self.predecessors(island_id).count() as u32;
+ }
+ let mut queue: VecDeque<IslandId, _> = VecDeque::new_in(scratch.clone());
+ for (island_id, _) in self.iter_nodes() {
+ if in_degree[island_id] == 0 {
+ queue.push_back(island_id);
+ }
+ }
+
+ let mut topo = Vec::with_capacity_in(node_count, scratch.clone());
+ while let Some(island_id) = queue.pop_front() {
+ topo.push(island_id);
+
+ for successor in self.successors(island_id) {
+ in_degree[successor] -= 1;
+
+ if in_degree[successor] == 0 {
+ queue.push_back(successor);
+ }
+ }
+ }
+
+ assert_eq!(
+ topo.len(),
+ node_count,
+ "island requirement resolution requires acyclic control flow",
+ );
+
let start = self.lookup[BasicBlockId::START];
RequirementResolver::new(self, start, scratch).resolve(&topo);
@@ -379,7 +404,7 @@
}
fn resolve(mut self, topo: &[IslandId]) {
- // Iterate in reverse for topological order
+ // Iterate in topological order so dominators are processed before dependents.
for &island_id in topo {
let island = &self.graph[island_id];
let IslandKind::Exec(_) = &island.kind else {
diff --git a/libs/@local/hashql/mir/src/pass/execution/island/schedule/mod.rs b/libs/@local/hashql/mir/src/pass/execution/island/schedule/mod.rs
--- a/libs/@local/hashql/mir/src/pass/execution/island/schedule/mod.rs
+++ b/libs/@local/hashql/mir/src/pass/execution/island/schedule/mod.rs
@@ -122,6 +122,12 @@
}
}
+ assert_eq!(
+ entries.len(),
+ node_count,
+ "island schedule requires acyclic control flow",
+ );
+
entries.sort_by_key(|entry| entry.level);
IslandSchedule { entries }
}This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
5e31131 to
b32bd45
Compare
b8fe72b to
69a8667
Compare
b32bd45 to
9231d97
Compare
69a8667 to
40ba97b
Compare
40ba97b to
a4534ce
Compare



🌟 What is the purpose of this PR?
This PR implements the island dependency graph construction and scheduling system for the HashQL MIR execution engine. It builds a directed graph over computation islands, resolves data requirements between islands, and computes a topological schedule with parallelism levels for execution.
🔗 Related links
🔍 What does this change?
first_set()method toFiniteBitSet- Returns the first set bit usingtrailing_zeros(), with comprehensive test coverage for empty sets, single bits, multiple bits, and wide integral typesIdVecwith new utility methods - Addsfrom_raw(),from_domain_derive(),extend_from_slice(),append(),into_iter_enumerated(), andcopy_within()methods with detailed documentation and examplesIslandGraph) - Creates a directed graph overIslandNodes connected by three edge types:ControlFlow(execution ordering),DataFlow(data dependencies), andInherits(path inheritance between same-target dominators)IslandSchedule) - Computes topological ordering with parallelism levels using Kahn's algorithm, allowing islands at the same level to execute concurrentlyoption_into_flat_iterfeature - Adds the unstable feature flag for iterator functionalityPre-Merge Checklist 🚀
🚢 Has this modified a publishable library?
This PR:
📜 Does this require a change to the docs?
The changes in this PR:
🕸️ Does this require a change to the Turbo Graph?
The changes in this PR:
🛡 What tests cover this?
FiniteBitSet::first_set()covering empty sets, single/multiple bits, edge cases❓ How to test this?
cargo test